Abstract: Severe acute respiratory syndrome (SARS) is an emerging infectious disease caused by a novel human 
coronavirus (CoV). During the 2003 epidemic, the disease rapidly spread from its origin in southern China to other 
countries and affected almost 8000 patients, which resulted in about 800 fatalities. A chymotrypsin-like cysteine protease 
named 3C-like protease (3CLpro) is essential for the life cycle of the SARS-CoV. This main protease is responsible for 
maturation of functional proteins and represents a key anti-viral target. HPLC and fluorescence-based assays have been 
used to characterize the protease and to determine the potency of the inhibitors. The fluorogenic method monitoring the 
increase of fluorescence from the cleavage of a peptide substrate containing an Edans-Dabcyl fluorescence quenching pair 
at two ends has enabled the use of high throughput screening to speed up the drug discovery process. Several groups of 
inhibitors have been identified through high throughput screening and rational drug design approaches. Thus, α,β-unsaturated 
peptidomimetics, anilides, metal-conjugated compounds, boronic acids, quinolinecarboxylate derivatives, 
thiophenecarboxylates, phthalhydrazide-substituted ketoglutamine analogues, isatin and natural products have been 
identified as potent inhibitors of the SARS-CoV main protease. The different classes of inhibitors reported in these studies 
are summarized in this review. Some of these inhibitors could be developed into potential drug candidates, which may 
provide a solution to combat a possible reoccurrence of SARS and of other life-threatening viruses with 3CL proteases. 
GENERAL INTRODUCTION 


Severe acute respiratory syndrome (SARS) is an atypical pneumonia characterized by
non-productive cough, high fever and headache that may progress to generalized
interstitial infiltrates in the lung. This recently emerged disease first occurred in
the Guangdong province of China in late 2002 and subsequently spread to over 25
countries in 2003. SARS is caused by a novel human coronavirus (CoV) named SARS-CoV
[1-6]. This virus belongs to the Coronaviridae family (see Fig. (1) for sequence
homology of their main proteases), which includes porcine transmissible
gastroenteritis virus (TGEV), human coronavirus (HCoV) 229E, mouse hepatitis virus
(MHV), bovine coronavirus (BCoV), and porcine epidemic diarrhea virus (PEDV) [7-9].
These coronaviruses are large, enveloped, positive single-stranded RNA viruses
containing 27-31 kb genomes, which cause respiratory and enteric diseases in humans
and other animals. The SARS-CoV genome comprises about 29,700 nucleotides, which
encode non-structural and structural proteins. Two overlapping replicase
polyproteins, pp1a (486 kDa) and pp1ab (790 kDa), mediate all the functions required
for viral replication and transcription [10,11]. These non-structural polyproteins
are autocatalytically processed by the virally encoded main protease and papain-like
protease to yield mature proteins including the RNA-dependent RNA polymerase (RdRp),
the RNA helicase, and other proteins whose functions are not well characterized [12]
(Fig. (2)). The main protease is called 3C-like protease (3CLpro) since it is
analogous to the 3C proteases encoded by picornaviruses [13]. Due to its pivotal role
in the SARS-CoV life cycle, the 3CLpro has been considered to be a promising target
for anti-SARS drug discovery.


In addition to the non-structural proteins, the structural proteins of
coronaviruses, including the S (spike), E (envelope), M (membrane), and N
(nucleocapsid) proteins, function during host cell entry and virion morphogenesis
and release [14]. The S protein on the surface of the virus is a membrane
glycoprotein responsible for virus attachment to the host receptor, which was
identified as angiotensin-converting enzyme 2 on human cells for SARS-CoV [15]. N
binds to a defined packaging signal on viral RNA, leading to the formation of the
helical nucleocapsid. M is localized at intracellular membrane structures and
interacts with the nucleocapsid to form a viral core structure. The conserved
segments of the S, E, M, and N proteins can be found in SARS-CoV and related
coronaviruses [2]. These proteins
including N, E, M, and truncated forms of the S (S1-S7) of 
SARS-CoV have been expressed in Escherichia coli and six 
proteins N, E, M, S2, S5, and S6 were used for Western blot 
to detect various immunoglobulin classes in serum samples 
from probable SARS patients [16]. The results indicated that 
N was recognized in most of the sera. In some cases, S6 
could be recognized as early as 2 or 3 days after illness 
onset, while S5 was recognized at a later stage. 


Fig. (1). Sequence homology of the 3CLpro from a group of coronaviruses including SARS-CoV, HCoV 229E, TGEV, PEDV, MHV, and
BCoV. The numbering follows that of SARS 3CLpro. The catalytic dyad residues Cys145 and His41 are conserved in all the sequences.


SARS-CoV 3CLpro cleaves pp1a at the 11 predicted conserved sites with a consensus
sequence of (Leu,Met,Phe)-Gln↓(Ser,Ala,Gly), in which a P1 glutamine residue
invariably occupies the S1 subsite [17] (see the bottom panel in Fig. (2)). The
protease is a chymotrypsin-like protease but uses a Cys rather than a Ser residue as
the active site nucleophile [18]. Moreover, the active site of the SARS protease
comprises a catalytic dyad, Cys145 and His41, rather than a triad [18,19]. The
protease contains three domains and the active site is located between domains I and
II. Several crystal structures of coronavirus 3CLpro (apo form or with suicide
inhibitors) reported from TGEV, HCoV 229E and SARS-CoV [18,20-22] reveal a common
feature in 3CLpro: two chymotrypsin-like β-domains (residues 1-184) and one α-helical
dimerization domain (residues 201-303). The additional helical C-terminal domain of
about 100 residues, absent from the analogous picornavirus 3C protease and
chymotrypsin, is essential for dimerization of the 3CLpro and its enzymatic activity
[20,23]. In addition to the C-terminus, the N-finger containing a number of
N-terminal amino acids is important for enzyme activity of the main proteases from
TGEV and SARS-CoV since the deletion of the N-finger abolishes enzyme activity [20,24].
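The cleavage-site consensus quoted above, (Leu,Met,Phe)-Gln↓(Ser,Ala,Gly), can be written as a simple pattern over one-letter amino acid codes. The following is a minimal sketch, assuming that only the P2, P1 and P1' positions are checked; the function names are hypothetical and the scan is an illustration of the consensus, not the prediction method used in the cited studies. The example peptide is the HPLC substrate sequence TSAVLQSGFRKW mentioned later in this review.

```python
import re
from typing import List

# Consensus from the text: P2 in {L, M, F}, P1 = Q, P1' in {S, A, G}.
# The zero-width pattern matches at the scissile bond itself.
CLEAVAGE_SITE = re.compile(r"(?<=[LMF]Q)(?=[SAG])")


def find_cleavage_sites(sequence: str) -> List[int]:
    """Return the 0-based index of the P1' residue for every consensus match."""
    return [m.start() for m in CLEAVAGE_SITE.finditer(sequence.upper())]


def split_at_sites(sequence: str) -> List[str]:
    """Cut the sequence at every predicted scissile bond."""
    bounds = [0, *find_cleavage_sites(sequence), len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    peptide = "TSAVLQSGFRKW"  # substrate sequence quoted in the text
    print(find_cleavage_sites(peptide))
    print(split_at_sites(peptide))
```

A three-residue pattern of this kind will over-predict on a full polyprotein, since the text notes that positions beyond P2 to P1' also contribute to substrate recognition.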


SARS-CoV Protease 
Fig. (2). The two SARS-CoV polyproteins, pp1a and pp1ab (pp1a + pp1b), with 3CLpro cleavage sites indicated by arrows. The amino acid 
numbering of the polyproteins is marked at the top. The eleven cleavage sites with conserved amino acid sequences are shown in the bottom 
panel. The proteolytic processing at different sites could result in maturation of nsp (non-structural proteins). Some predicted end-products of 
the cleavage including 3CLpro itself, RdRp (RNA-dependent RNA polymerase) and helicase are shown. 


HPLC and fluorescence-based methods have been used 
to characterize the protease kinetics and inhibition. The 
HPLC method is used to monitor the formation of product 
peaks from the substrate peak [17]. A fluorogenic substrate 
is used in the fluorescence assay and the increase of 
fluorescence from the cleavage of a substrate that contains 
a fluorescence quenching pair at the N- and C-termini is 
measured [25]. The latter method is fast, can be automated, 
and is thus suitable for high throughput assay. Many 
inhibitors have been discovered through high throughput 
random screening and rational design by using either the 
HPLC or the fluorescence-based assay (vide infra). 


In this review, the assay methods, the dimerization of the SARS-CoV 3CLpro, the 3-D
structures of the protease, and the inhibitors identified so far are summarized and
discussed. There is currently no effective treatment for SARS. A combination of
ribavirin as an antiviral and corticosteroids for immunomodulation has been used to
treat SARS patients [26-30]. However, ribavirin at non-toxic concentrations has
little in vitro inhibitory activity against SARS-CoV [31]. Improved clinical outcome
has been reported for patients receiving early administration of the HIV drug Kaletra
plus ribavirin and corticosteroids [32], or lopinavir/ritonavir [33]. Human
interferons were also reported to be effective against SARS [31,34], but there is no
clear evidence to support the clinical observation. The data summarized here serve as
a firm basis for developing therapeutic methods to deal with the possible
reoccurrence of SARS in the future and may also lead to new drugs for other viral
diseases caused by viruses with similar proteases.


ASSAY OF THE PROTEASE ACTIVITY 


The recombinant SARS main protease was initially expressed and purified by several
laboratories [17,25,35,36]. Some recombinant forms of the protease contained a
C-terminal hexa-His tag used for Ni-NTA column chromatography, or extra N-terminal
amino acids left from incomplete tag removal by thrombin cleavage [17,35,36]. Other
laboratories engineered an FXa cleavage site at the N-terminus of the protease to
remove the affinity tag, which yielded recombinant protease with the authentic amino
acid sequence [25]. These recombinant proteins have different properties, especially
in dimerization, as described below.


An HPLC method was first used to assay the activity of the protease. Products
generated from a peptide substrate such as H2N-TSAVLQ↓SGFRKW-COOH (the cleavage site
is indicated with ↓) can be separated by using a reverse-phase HPLC column and a
linear gradient of acetonitrile [17]. The absorbance can be measured at 215 or 280 nm
and the peak areas can be integrated to calculate the rate of the protease reaction.
The kinetic parameters were determined by fitting the data to the Michaelis-Menten
equation. The 11 peptides corresponding to the possible cleavage sites (Fig. (2)) of
the SARS main protease on the pp1a and pp1ab polyproteins were tested as substrates
for the protease, and the peptides spanning the protease's own N- and C-termini were
the best substrates [17]. The conserved core sequence of the native cleavage sites of
the protease was confirmed to be optimal for high hydrolytic activity in a more
detailed study using 34 synthetic peptides as substrates [37]. Amino acids at
positions P3, P4 and P3' were found to be critical for substrate recognition.
Increasing the β-sheet character of the substrates was also important.
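The fitting step mentioned above, in which rates derived from HPLC peak areas are fitted to the Michaelis-Menten equation v = Vmax[S]/(Km + [S]), can be sketched as follows. This is only an illustration: it uses the Hanes-Woolf linearization ([S]/v = [S]/Vmax + Km/Vmax) with ordinary least squares so that it needs no external libraries, whereas a direct non-linear fit is the more usual choice for real data. The function names and all numerical values are hypothetical and are not taken from the cited studies.

```python
from typing import Sequence, Tuple


def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate: Sequence[float],
                    rates: Sequence[float]) -> Tuple[float, float]:
    """Estimate (Vmax, Km) from the straight line [S]/v = [S]/Vmax + Km/Vmax."""
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


if __name__ == "__main__":
    # Hypothetical, noise-free data: Vmax = 1.0 (arbitrary units), Km = 50 µM.
    conc = [10.0, 25.0, 50.0, 100.0, 200.0]
    obs = [michaelis_menten(s, 1.0, 50.0) for s in conc]
    print(fit_hanes_woolf(conc, obs))
```

Linearized fits weight experimental errors unevenly, so this form is best kept for quick checks of a data set.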


However, for high throughput screening to identify inhibitors, a more convenient
assay method was required. Our laboratory has utilized a fluorogenic substrate
(Dabcyl-KTSAVLQSGFRKME-Edans), which is cleaved by the protease with an intense
increase in fluorescence [25,38]. Dabcyl and Edans form a quenching pair: the
fluorescence of Edans is quenched by Dabcyl, which is in close proximity (in this
case 14 amino acids away), but the fluorescence becomes high when the peptide is
cleaved by the protease. This fluorescence resonance energy transfer (FRET)
technology has been demonstrated to be useful for assaying retroviral proteases [39].
The fluorescence-based assay has become the method of choice to evaluate the
potencies of inhibitors from high throughput screening. Other fluorogenic substrates
used for the assay include Dabcyl-Leu-Ala-Gln-Ala-Val-Arg-Ser-Ser-Ser-Arg-Edans [35],
Abz-Ser-Val-Thr-Leu-Gln-Ser-Gly-Tyr(NO2)-Arg [40], and a substrate bearing an Abz-DNP
quenching pair [41].
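In a fluorogenic assay of this kind, the quantity extracted from each well is typically the initial slope of fluorescence versus time, and an inhibitor is scored by how much it lowers that slope relative to an uninhibited control. The sketch below illustrates this under stated assumptions: the readings are hypothetical, all supplied points are assumed to lie in the initial linear phase, and the function names are invented for this example rather than taken from the cited assays.

```python
from typing import Sequence


def initial_rate(times: Sequence[float], signal: Sequence[float]) -> float:
    """Least-squares slope of fluorescence versus time (units per time unit)."""
    n = len(times)
    if n < 2 or n != len(signal):
        raise ValueError("need at least two matched (time, signal) points")
    mean_t = sum(times) / n
    mean_f = sum(signal) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    stf = sum((t - mean_t) * (f - mean_f) for t, f in zip(times, signal))
    return stf / stt


def percent_inhibition(rate_with_inhibitor: float, rate_control: float) -> float:
    """Percentage by which the inhibitor lowers the initial rate."""
    return 100.0 * (1.0 - rate_with_inhibitor / rate_control)


if __name__ == "__main__":
    # Hypothetical readings (arbitrary fluorescence units, time in seconds).
    t = [0.0, 10.0, 20.0, 30.0]
    control = [100.0, 150.0, 200.0, 250.0]
    treated = [100.0, 112.5, 125.0, 137.5]
    v0 = initial_rate(t, control)
    vi = initial_rate(t, treated)
    print(v0, vi, percent_inhibition(vi, v0))
```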


DIMERIZATION OF THE PROTEASE 


A number of reports in the literature show that the protease can exist as a monomer
or a dimer and that only the dimer is active [42-44]. Size exclusion chromatography,
measurements of activity versus enzyme concentration, and analytical
ultracentrifugation (AUC) are examples of tools that have been used to measure the
monomer-dimer equilibria. However, the reported dissociation constants (Kd) of the
dimeric SARS main protease are controversial. As mentioned above, different
recombinant forms of the SARS 3CLpro containing no or extra amino acids at either the
N- or C-terminus have been used for these studies, and the properties of these
recombinant proteases are somewhat different. The discrepancy seems to result from
different monomer-dimer equilibria, probably due to the interference of those extra
amino acids in dimer formation, different pH, different protein concentrations,
and/or the measurement tools used. Fan et al. used the gel filtration size exclusion
method to study the recombinant SARS protease containing a C-terminal His6 tag [17].
The enzyme existed as a mixture of monomer and dimer at a higher protein
concentration (4 mg/mL, ~118 µM) and exclusively as a monomer at a lower protein
concentration (0.2 mg/mL, ~6 µM), as revealed by analytical gel filtration. The
dissociation constant Kd of the dimer was thus estimated to be 100 µM. The
recombinant proteins prepared by Bacha et al. contained extra amino acids at the
N-terminus and their enzyme existed largely in the monomeric form, similar to
that observed by Fan et al. [35].
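The link between a dimer Kd and the species seen at a given protein concentration can be made explicit with the monomer-dimer mass balance: with Kd = [M]^2/[D] and total protomer concentration Ct = [M] + 2[D], the monomer concentration is [M] = (Kd/4)(sqrt(1 + 8Ct/Kd) - 1). The sketch below applies this to the concentrations quoted above. The protomer mass of 33,900 Da is an assumption back-calculated from the text's own conversion (4 mg/mL ≈ 118 µM), and the function names are hypothetical.

```python
import math

# Assumption: protomer mass inferred from the text's "4 mg/mL ~118 µM".
ASSUMED_PROTOMER_MASS_DA = 33_900.0


def mg_per_ml_to_micromolar(conc_mg_per_ml: float,
                            mass_da: float = ASSUMED_PROTOMER_MASS_DA) -> float:
    """Convert mg/mL (equal to g/L) into µM of protomer."""
    return conc_mg_per_ml / mass_da * 1e6


def dimer_fraction(total_protomer: float, kd: float) -> float:
    """Fraction of protomers present as dimer; both arguments in the same unit."""
    monomer = kd / 4.0 * (math.sqrt(1.0 + 8.0 * total_protomer / kd) - 1.0)
    return 1.0 - monomer / total_protomer


if __name__ == "__main__":
    for mg_ml in (4.0, 0.2):
        c_um = mg_per_ml_to_micromolar(mg_ml)
        # Kd = 100 µM (gel filtration estimate) versus Kd = 15 nM = 0.015 µM.
        print(mg_ml, round(c_um, 1),
              round(dimer_fraction(c_um, 100.0), 3),
              round(dimer_fraction(c_um, 0.015), 3))
```

Under these assumptions a Kd of 100 µM predicts roughly half of the protomers in dimers at 4 mg/mL and about a tenth at 0.2 mg/mL, whereas a nanomolar Kd predicts an almost fully dimeric sample at both concentrations.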


Using the fluorescence assay, we determined the Kd of the protease that had the
authentic amino acid sequence (i.e., without extra amino acid residues) [25]. An
enzyme concentration range of 5-150 nM or 50-3000 nM with 60 µM fluorogenic substrate
was used to determine the Kd. The plot of reaction rate versus enzyme concentration
becomes non-linear when the enzyme concentration approaches the dimer Kd, indicating
that the monomer is inactive, and the apparent Kd value for the dimer-monomer
equilibrium of our enzyme was measured to be 15 nM. Thus, this Kd (15 nM) is
remarkably smaller than the one (100 µM) previously estimated from the analytical gel
filtration
experiments.
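If only the dimer is active, the observed rate is proportional to the dimer concentration rather than to the total enzyme concentration, which is why the rate plot curves near the Kd. The following sketch shows one way such data could be analyzed: for each candidate Kd on a grid, the proportionality constant is obtained in closed form by least squares and the Kd with the smallest residual is kept. This is an illustrative procedure under stated assumptions, not necessarily the fitting method used in ref. [25]; the function names, the rate constant and the synthetic data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def dimer_concentration(total_protomer: float, kd: float) -> float:
    """[D] from Kd = [M]^2/[D] and Ct = [M] + 2[D]; same unit for both inputs."""
    monomer = kd / 4.0 * (math.sqrt(1.0 + 8.0 * total_protomer / kd) - 1.0)
    return (total_protomer - monomer) / 2.0


def fit_kd(enzyme: Sequence[float], rates: Sequence[float],
           kd_grid: Sequence[float]) -> Tuple[float, float]:
    """Return (Kd, k) minimizing the squared error of the model rate = k * [D]."""
    best = None
    for kd in kd_grid:
        dimers = [dimer_concentration(e, kd) for e in enzyme]
        k = sum(r * d for r, d in zip(rates, dimers)) / sum(d * d for d in dimers)
        sse = sum((r - k * d) ** 2 for r, d in zip(rates, dimers))
        if best is None or sse < best[0]:
            best = (sse, kd, k)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical noise-free rates generated with Kd = 15 nM and k = 2.0.
    enzyme_nm = [5.0, 10.0, 20.0, 40.0, 80.0, 150.0]
    rates = [2.0 * dimer_concentration(e, 15.0) for e in enzyme_nm]
    grid = [0.5 * i for i in range(1, 201)]  # 0.5 to 100 nM
    print(fit_kd(enzyme_nm, rates, grid))
```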


We also performed size exclusion chromatography at both high (4 mg/mL) and low
(0.2 mg/mL) enzyme concentrations, and in both cases the SARS main protease showed a
predominant peak of the dimer. The analytical ultracentrifugation (AUC) method was
utilized to examine the quaternary structures of the wild-type 3CLpro and of the
C145A mutant protease (to prevent autoactivation) containing additional N- or
C-terminal segments of the polyprotein sequences, in order to compare the Kd values
of their dimers [22]. The AUC data for the wild-type SARS protease indicate that the
determined molecular weight is that of a dimer and that the dimeric wild-type protein
has a Kd of 0.35 nM. In contrast, the recombinant protease that contains 10 extra
amino acids belonging to its own self-cleavage site at the N- or C-terminus shows 49-
and 16-fold larger Kd values (17.2 nM and 5.6 nM), respectively. Therefore, even with
only 10 extra amino acids at the N- or C-terminus, dimer formation was impeded.
However, the sub-µM Kd value of the protease indicates that immature 3CLpro can form
a small amount of dimer, enabling it to undergo autoprocessing to yield the mature
protein, which further serves as a seed for facilitated maturation [22]. From the
above studies, the extra amino acids at the N- and/or C-terminus of the protease
apparently affect dimer
formation.
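The fold changes quoted above follow directly from the ratio of each Kd to the wild-type value, as this short check shows. The Kd values are those given in the text; the dictionary keys and the function name are labels invented for this example.

```python
# Dimer Kd values from the AUC measurements quoted in the text (nM).
KD_WILD_TYPE_NM = 0.35
KD_EXTENDED_NM = {
    "N-terminal extension": 17.2,
    "C-terminal extension": 5.6,
}


def fold_increase(kd_nm: float, reference_nm: float = KD_WILD_TYPE_NM) -> float:
    """Ratio of a Kd to the wild-type reference Kd."""
    return kd_nm / reference_nm


if __name__ == "__main__":
    for label, kd in KD_EXTENDED_NM.items():
        print(label, round(fold_increase(kd)))
```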


However, using the AUC method, Chou et al. reported a Kd of 190 nM at pH 8, even
though the recombinant protease contained a His6-tag at its C-terminus [45]. On the
other hand, deletion of N-terminal amino acids increased the Kd and decreased the
protease activity, suggesting that the N-finger is important for dimer formation. The
sequential deletion of the first 3 and 7 residues at the N-terminus caused a 12- and
1275-fold increase in dimer Kd, respectively [24]. In particular, Arg4 is the most
important residue for dimer formation, since deletion of the first 3 residues caused
only a 12-fold increase in Kd, whereas deletion of the first 4 residues caused a
205-fold increase in Kd. In the crystal structure, the N-terminal residues 1-7
(N-finger) are buried in the dimer interface and make numerous contacts with domain
II close to the active site of the other protomer [21,22]. In conclusion, extra
exogenous amino acids and the lack of the first 4 residues at the N-terminus have a
major impact on protease dimerization.


3-D STRUCTURES OF THE PROTEASE 


The first crystal structure of the SARS-CoV main protease and that of its complex
with a substrate-like hexapeptidyl chloromethylketone (CMK) inhibitor were reported
by Rao and his coworkers [21]. Analogous to the previously solved structure of the
coronavirus main protease from TGEV [18], the SARS protease forms a dimer with the
two protomers oriented almost at a right angle to each other. Each protomer is
composed of three domains, which include the N-terminal domain I (residues 8-101) and
domain II (residues 102-184), both having an antiparallel β-barrel structure. In
contrast, the C-terminal domain III (residues 201-303) contains five α-helices
arranged into a large antiparallel globular cluster, and is connected with domain II
through a loop region (residues 185-200). The Cys-His catalytic dyad is located in
an active site cleft between domains I and II.


A. pH Variation of Structures 


At pH 6.0, an analysis of the crystal structure of the SARS protease indicates that
protomer A has a structure similar to those of the other coronavirus main proteases
at pH 7.6 (active form), but protomer B shows a collapsed active site due to the
lower pH (inactive form). The pH profile of the enzyme activity confirmed that the
protease is fully active at pH above 7, but the activity is dramatically decreased at
pH 6. In protomer A, the oxyanion hole is intact and the N-finger of protomer B docks
to its binding site; the main-chain NH of Gly143(A) is available for H-bonding to the
oxyanion intermediate, and the side-chain NH of the His163(A) imidazole ring is free
to bind the P1 Gln of the substrate. In the inactive conformation of protomer B,
these interactions are absent and the N-finger of protomer A is not docked to its
binding site. This results in the collapse of the oxyanion hole, with protrusion of
Phe140(B) into the bulk solvent and conformational switching of Glu166(B), thereby
blocking the substrate site.


B. Substrate-Binding Subsites 


The S1 subsite in protomer A (active form) consists of the side chains of His163(A)
and Phe140(A) as well as the main chains of Met165(A), Glu166(A) and His172(A). The
Glu166(A) side chain forms a salt bridge with His172(A) and also interacts with the
amide group of the N-terminal Ser1 from protomer B. The N-finger (residues 1-7) plays
an important role in the dimerization and formation of the active site. With the
hexapeptidyl CMK (Cbz-Val-Asn-Ser-Thr-Leu-Gln-CMK) inhibitor bound, a covalent bond
is formed between the Sγ atom of Cys145 and the methylene group of the CMK. The
structure of the complex at 2.5 Å resolution reveals an unexpected mode of inhibitor
binding [21]. In protomer A (the active form), the side-chain carbonyl of P1-Gln
accepts a H-bond (2.8 Å) from the Nε2 atom of His163(A). The side-chain Nε2 of P1-Gln
donates a H-bond to the side-chain carbonyl of the conserved Glu166(A). However,
P2-Leu fails to bind to the S2 subsite in the vicinity of Asp187 and becomes solvent
exposed. This noncanonical binding results in a frameshift in the subsite
interaction: P3-Thr and P5-Asn bind at the S2 and S4 subsites, respectively. A
plausible reason for this observation is that the peptide inhibitor used did not
contain the best-fit sequence.


C. Protease in the Product-Bound Form 


Besides the frameshift of the substrate-like inhibitor, the above structure failed
to show a clear electron density at the C-terminus. However, the structures solved by
Hsu et al. remedied this shortcoming and revealed a novel product-bound form [22].
The C-terminal residues of the C145A mutant protease are intercalated into the
neighboring protomer, creating a product-bound structure that may resemble the
intermediate during autoprocessing [22]. The two protomers of dimeric C145A, denoted
“A” and “B”, are oriented perpendicularly to each other, and each protomer contains
three domains like those found in the wild-type structure solved previously. However,
the two subunits have asymmetric structures and the active site of protomer B is
intercalated with the C-terminal residues 301 to 306 of protomer B’ (shown with cyan
ribbon) from the dimer in another asymmetric unit (Fig. (3)). The N-terminal residues
of protomer A (shown with green ribbon) are located near the active site of protomer
B. This structure reveals how the product is bound in the active site during the
maturation process, and the six amino acids at the C-terminus of protomer B’
represent the P6 to P1 sites of the autoprocessed product.


In the S1 site, the side-chain Oε1 of Gln306 (P1) forms a hydrogen bond with the
side-chain Nε2 of His163. The side-chain Nε2 of P1-Gln donates a H-bond to the
side-chain carboxylate of Glu166. Moreover, the oxygen anion at the free carboxylate
end of P1-Gln forms H-bonds with the backbone NH atoms of Gly143 and Cys145. If
Ala145 is replaced by Cys using computer modeling (shown in blue) to generate the
active form, the Sγ atom of Cys145 is at a suitable position to attack the P1
carboxyl group. Residues 140-145 and 163-166 form the “outer wall” of the S1 site.
The S2 site of C145A is formed by the main-chain atoms of Val186, Asp187, Arg188, and
Gln189 as well as the side-chain atoms of His41, Met49, and Met165, suggesting that
the P2 site prefers a bulky side chain such as Val, Leu, or Phe. The N atom in the
main chain of P2-Phe interacts with the O atom of His164, and the side chain
interacts with Met49, Met165, Asp187 and Arg188 through hydrophobic contacts.
Residues 186-188 line the S2 subsite with some of their main-chain atoms. The side
chain of P3-Thr is oriented toward bulk solvent. The O atom of the Thr accepts a
H-bond (2.9 Å) from the NH of Glu166. Residues Met165, Leu167, Gln189, Thr190, and
Gln192 surround the S4 subsite, which also favors a hydrophobic side chain. The
main-chain O atom of P4-Val accepts a H-bond (3.1 Å) from the Nε2 atom of Gln189, the
N atom of the Val donates a H-bond to the Oε1 atom of Gln189, and another main-chain
NH donates a H-bond (3.3 Å) to Gln189. The side chain of P4-Val interacts with Met165
and Gln189 via hydrophobic interactions. The S5 subsite is composed of the main-chain
atoms of Thr190, Ala191, and Gln192. P5-Gly is not in contact with the protease. The
S6 site is positioned almost at the outer surface of the protein. However, the O atom
and Oγ of P6-Ser still interact with the backbone N and O atoms of
Gln192.


INHIBITORS OF THE SARS MAIN PROTEASE 


So far, many inhibitors with low µM and sub-µM activities have been identified from
high throughput random screening and rational design approaches. High throughput
screening has been performed using either a cell-based assay, which observes the
protective effect of the compounds on VeroE6 cells infected by SARS-CoV, or a target
(protease)-based assay, which monitors the inhibitory activities of the compounds on
the 3CLpro reaction. The compound banks used include FDA approved drugs, compounds
with biological activities, synthetic compound libraries, known protease inhibitors,
herbal medicine components, natural products and others. Inhibitors from rational
design approaches include existing inhibitors of the human rhinovirus protease that
have been modified to fit the active site of the SARS protease, peptidomimetics
designed from the substrate specificity of the protease, thiol chelating compounds
targeting the active site Cys, and others as described below.


Fig. (3). A product-bound structure of SARS 3CLpro resulting from the intercalation of the C-terminal six amino acids of protomer B’ into
protomer B, while protomer A forms a dimeric complex with protomer B. This bound protomer B’ resembles the processed product.


A. From High Throughput Screening 


Wu et al. have used a Vero cell-based assay to screen many agents including about
200 drugs approved by the Food and Drug Administration, more than 8000 synthetic
compounds, about 1000 traditional Chinese herbs, and almost 500 protease inhibitors.
Of the compounds tested, about 50 are active anti-SARS-CoV compounds, including two
existing drugs, Reserpine and Aescin [38]. The rationale for testing existing drugs
for anti-SARS activity is that developing them into anti-SARS drugs can save time and
money. These screenings were based on the cell cytopathogenic effect, ELISA,
Western-blot analysis, immunofluorescence and flow cytometry methods. Subsequently,
the fluorescence-based assay method using the Dabcyl-KTSAVLQSGFRKME-Edans substrate
was performed to identify compounds that inhibit the protease. A compound that was
developed as a transition-state analogue of the HIV protease (Ki = 1.5 nM toward the
HIV protease and 4 nM against feline immunodeficiency virus protease) was also found
to be active against the SARS main protease (Ki = 0.6 µM) (see TL-3 in Table 1). Also
from this study, Lopinavir (one of the two components of the anti-AIDS drug Kaletra)
was shown to inhibit the SARS main protease (Ki = 15 µM), which is consistent with
the previously observed better clinical outcome for SARS patients treated with this
drug [33].


Besides Reserpine and Aescin, the existing anti-helminthic 
drug, niclosamide (2’,5-dichloro-4’-nitrosalicylanilide), was 
also found to inhibit replication of SARS-CoV [46]. In Vero 
E6 cells, synthesis of viral spike protein and nucleocapsid 
protein was abolished at a niclosamide concentration of 1.6 
µM as revealed by immuno-blot analysis. 


Blanchard et al. used a FRET-type (substrate = 2-aminobenzoyl-SVTLQSG-Tyr(NO2)-R)
high throughput screening approach on 50,000 drug-like small molecules to find SARS
protease inhibitors [40]. Five hundred and seventy-two hits were identified from the
primary screening. After a series of virtual and experimental filters, five novel
small molecules (MAC-5576, MAC-8120, MAC-13985, MAC-22272, and MAC-30731) with
IC50 = 0.5-7 µM were identified as SARS 3CLpro inhibitors, as listed in Table 1.
Their data are available for download from the McMaster HTS Lab
(http://hts.mcmaster.ca/sars).


Kao et al. screened 50,240 structurally diverse small molecules, from which they
identified 104 compounds with anti-SARS-CoV activity [47]. Of these 104 compounds, 2
were found to target the SARS 3CLpro by using an HPLC-based assay method. One of the
compounds (MP576 shown in Table 1) displayed inhibitory activity with an IC50 of
2.5 µM in the protease assay and an EC50 of 7 µM in the Vero cell-based SARS-CoV
plaque reduction assay.


Table 1. SARS Main Protease Inhibitors Obtained from High Throughput Screening
[Table columns: Structure; Ki or IC50 (µM); Reference. Legible entries: MAC-30731, phenylmercuric acetate, hexachlorophene, thimerosal. Structure drawings and numerical entries are not recoverable.]


A compound library containing 960 commercially available drugs and biologically
active substances was screened by Hsu et al. for inhibition of the SARS-CoV 3CLpro
[48]. Three hits, namely phenylmercuric acetate, thimerosal, and hexachlorophene (see
Table 1 for their structures), were discovered in an in vitro protease assay. They
were also effective in suppressing viral replication and the synthesis of the viral
spike protein. Determination of the Ki values against 3CLpro indicated that these
compounds are competitive inhibitors with Ki values of 0.7 µM for phenylmercuric
acetate, 2.4 µM for thimerosal, and 13.7 µM for hexachlorophene (Table 1).
Phenylmercuric acetate and thimerosal are used as pharmaceutical excipients, and are
widely used as antimicrobial preservatives in parenteral and topical pharmaceutical
formulations [49]. In particular, phenylmercuric acetate is used as an antimicrobial
preservative in cosmetics, as a bactericide in parenterals and eye-drops, and as a
spermicide. Hexachlorophene is an antibacterial agent that is a common ingredient of
soaps and scrubs and is experimentally used as a cholinesterase inhibitor. The
hexachlorophene derivatives were further explored as SARS protease inhibitors by Liu
et al. [42].
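For a competitive inhibitor, the rate law is v = Vmax[S]/(Km(1 + [I]/Ki) + [S]), so the IC50 observed in an assay depends on the substrate concentration through the Cheng-Prusoff relation IC50 = Ki(1 + [S]/Km). The sketch below illustrates this with the Ki values quoted above. The Km and substrate concentration used here are hypothetical placeholders chosen only for the example, since the text does not give the values that apply to these measurements, and the function names are invented.

```python
def competitive_rate(s: float, i: float, vmax: float, km: float, ki: float) -> float:
    """Initial rate in the presence of a competitive inhibitor."""
    return vmax * s / (km * (1.0 + i / ki) + s)


def ic50_from_ki(ki: float, s: float, km: float) -> float:
    """Cheng-Prusoff relation for a competitive inhibitor."""
    return ki * (1.0 + s / km)


if __name__ == "__main__":
    # Ki values (µM) from the text.
    ki_um = {
        "phenylmercuric acetate": 0.7,
        "thimerosal": 2.4,
        "hexachlorophene": 13.7,
    }
    # Hypothetical assay conditions, not taken from the source.
    km_um, substrate_um = 20.0, 10.0
    for name, ki in ki_um.items():
        print(name, round(ic50_from_ki(ki, substrate_um, km_um), 2))
```

Because the correction factor grows with [S]/Km, IC50 values measured at different substrate concentrations are not directly comparable, whereas Ki values are.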


By screening a natural product library consisting of 720 compounds, Chen et al.
obtained two compounds, namely tannic acid (IC50 = 3 µM) and 3-isotheaflavin-3-gallate
(TF2B) (IC50 = 7 µM), as potent inhibitors of SARS 3CLpro [50]. These two compounds
belong to a group of natural polyphenols in tea, and therefore Chen et al. further
investigated the 3CLpro inhibitory activities of extracts from different types of tea
including green tea, oolong tea, Puer tea, and black tea. The results indicated that
the extracts from Puer tea and black tea were more potent in inhibiting SARS protease
activity. Several known ingredients of black tea were then evaluated for
anti-protease activity and theaflavin-3,3’-digallate (TF3) was found to inhibit the
protease (IC50 < 10 µM). TF3 is actually the most abundant (1.05%) theaflavin in
black tea [51].


The first natural product reported to inhibit SARS-CoV replication is Glycyrrhizin,
although it only showed an IC50 > 500 µM [52]. From the database of the International
Species Information System, the natural compounds whose structures have 80%
similarity to Glycyrrhizin, Aescin, and Reserpine were retrieved [38]. Fifteen
compounds were found to be structurally related to Glycyrrhizin and Aescin, and six
compounds to Reserpine. In the cell-based screening assay, four derivatives of
Glycyrrhizin and Aescin and all six analogues of Reserpine showed anti-SARS-CoV
activity at <100 µM. Among these compounds, Ginsenoside-Rb1 is one of the
pharmacologically active components of the traditional Chinese herb Panax ginseng
[53]. However, these are not SARS protease inhibitors and their targets are not
currently known.


B. From Rational Design 
1. α,β-Unsaturated Peptidomimetics 


AG7088, a ketomethylene isostere of a tripeptide-conjugated ester, is a potent
inhibitor of the rhinovirus 3C protease with an EC50 of 0.013 µM [54-56]. The
structure of AG7088 (G = CH2 and R = 4-F-C6H4 as shown in Fig. (4)) incorporates a
γ-lactam moiety to mimic the P1 glutamine residue and an α,β-unsaturated ester as a
Michael acceptor to form a covalent linkage with the active site Cys and hence
inactivate the enzyme. To improve cell membrane permeability, the P2 phenylalanine
residue was replaced with a methylene isostere bearing a 4-fluorophenyl substituent.
Although the model based on the crystal structure of the TGEV 3C protease predicted
that AG7088 can fit into the binding pocket of the SARS 3CLpro [18], the compound is
actually almost inactive towards the SARS protease [22]. A series of analogues were
further prepared to improve the inhibitory potencies. Shie et al. found that the
replacement of the γ-lactam moiety with a phenylalanine side chain increased the
inhibitory activity [57]. These conjugated esters were subjected to the inhibition
assay using a fluorometric method [25]. As shown in Fig. (4), most of the
tripeptide-conjugated esters (2 and 4 series) tend to be more active than the corresponding 
protease. For example, the compounds 3a—d and 4a—d with a 


NH 
O fe) 
a eo 
R N . N CO,Et 
H : H 
O : 
pa: 


tadG=CH, 2a-d: G= NH 
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phenyl group replacing the Pl lactam moiety showed better 
inhibitory activities (ICso = 11-39 uM) than AG7088 (la) 
and its analogs (1b—d and 2a—d) (ICs9 = 80 uM) [57]. From 
computer modeling, the P1 phenylalanine may shift to bind 
S2 subsite such that the o.,8-unsaturated Michael acceptor is 
beyond the reach of the thiol moiety of Cys145 for forming a 
covalent linkage. 


Since the best inhibitor 4d only has an IC50 of 11 µM, 
more analogues were synthesized in microtiter plates and 
their inhibitory activities against SARS protease were 
evaluated. As shown in Table 2, the best inhibitor (type A), 
an analogue of 4d, has an IC50 of 1 µM. The compound also 
showed potent activity against SARS-CoV replication by 
blocking the synthesis of viral spike protein [57]. 


2. Anilides 


With L-phenylalanine as the P1 residue, a series of 
peptide anilides were prepared and tested as SARS protease 
inhibitors [58]. Anilide 5 (Fig. (5)) was prepared by 
condensation of 2-chloro-4-nitroaniline with the acyl 
chloride derivative of Boc-Phe-OH. Using the previously 
reported amide formation in a microtitre plate [59,60], the 
coupling reactions of a 60-member library of carboxylic 
acids with the amine generated by removal of the Boc group 
from anilide 5 afforded a 60-member library of anilide 6. 
Tripeptide anilides 7a-x (24 members) and tetrapeptide 
anilides 8a-x (24 members) were also created by coupling of 
5 with appropriate peptides. 


Fig. (4). Compounds 1a-d to 4a-d. 
1a-d and 3a-d: G = CH2; 2a-d and 4a-d: G = NH 
a series: R = 4-F-C6H4, R' = 5-methyl-3-isoxazole 
b series: R = 4-F-C6H4, R' = PhCH2O 
c series: R = Ph, R' = 5-methyl-3-isoxazole 
d series: R = Ph, R' = PhCH2O 


Fig. (5). Compounds 5 to 8. 
R = i-Bu, PhCH2, 4-F-C6H4CH2 
R' = alkyl, alkoxy, aromatic, heterocyclic 


Table 2. Peptidomimetics and an Isatin as SARS Main Protease Inhibitors 

Inhibitor Type             Structure     Ki* or IC50 (µM)   Reference 
α,β-Unsaturated ester      [not shown] 
Anilide                    [not shown]   0.03* 
Keto-glutamine analogue    [not shown] 
Isatin                     [not shown] 


Among them, an anilide compound (inhibitor type B, see 
Table 2) derived from 2-chloro-4-nitroaniline, 
L-phenylalanine and 4-(dimethylamino)benzoic acid is the 
most potent inhibitor, showing a Ki of 0.030 µM. Deletion of 
the chloro, nitro or dimethylamino substituents from this 
compound significantly weakened the binding affinity. Also, 
replacing the dimethylamino group with a nitro group caused 
a reduction in inhibitor potency. According to molecular 
docking, the nitro group of the compound is predicted to be 
hydrogen bonded with the NH of Ala46, while the chlorine 
atom is within 3 Å of the γ-S atom of Cys145 and the Nε2 
atom of His41, therefore providing a possible key interaction 
with the catalytic dyad. 


3. Keto-Glutamine Analogues 


Since the SARS protease recognizes a glutamine residue 
at the P1 site, Jain et al. synthesized and evaluated a series of 
keto-glutamine analogues with a phthalhydrazido group at 
the α-position as reversible protease inhibitors [61]. 
Attachment of a tripeptide (Ac-Val-Thr-Leu) to these 
glutamine-based "warheads" (9) resulted in significantly 
better inhibitors (Fig. (6)). N,N-Dimethylglutamine analogues 
(9) are much less potent inhibitors (10-100-fold larger IC50) 
than cyclic glutamine analogues (10). The best inhibitor 
(inhibitor type C) is shown in Table 2. In the modeled 
structures, the inhibitor is bound in an extended 
conformation, forming a partial β-sheet and a hydrogen bond 
between His163 and the P1 side chain. The modeling studies 
indicate that the active site of the enzyme has enough room 
to accommodate the bulky phthalhydrazido group. Some 
rearrangements of the protein, in particular of residue 
Glu166, are required to accommodate the extra bulky group 
on the P1 residue. 


Fig. (6). Compounds 9 to 10. 


4. Isatin 


Certain isatin (2,3-dioxindole) derivatives, such as (11) 
shown in Fig. (7), are known to be potent inhibitors of the 
rhinovirus 3C protease [62]. The isatin core structure offers 
several advantages, including ease of synthesis and 
chemical modification. Its derivatives were tested as 
inhibitors of SARS 3CLpro. From a series of synthetic isatin 
derivatives (12) showing IC50 = 0.95-17.5 µM, the best 
inhibitor found is listed in Table 2 (inhibitor type D); it 
carries an iodo or bromo substituent on the isatin scaffold 
[63]. The benzothiophenemethyl side chain provides a 
greater inhibitory effect than the benzyl, 
heterocyclic-substituted methyl, and other alkyl groups. 


Fig. (7). Compounds 11 to 12. 


5. Aryl Boronic Compounds 


Bacha et al. proposed an attractive subsite for the design 
of potent inhibitors, namely a cluster of Ser residues 
(Ser139, Ser144, and Ser147) close to the catalytic residues 
[35]. In fact, this Ser cluster is conserved in all known 
coronavirus proteases and may represent a common target for 
wide-spectrum coronavirus protease inhibitors. Based on the 
known reactivity of boronic acid compounds with the 
hydroxyl group of Ser residues, the inhibitory potency of 
bifunctional boronic acid compounds was evaluated. A 
chemical scaffold containing two phenyl boronic groups 
attached to a central aromatic ring by ester linkages was 
tested. This compound has an inhibition constant in the low 
micromolar range (Ki = 4.6 µM). Different variations of the 
compound were prepared, including several isomers, 
replacement of the central aromatic ring with a shorter ester 
linkage, and different functionalities on the phenyl boronic 
rings. The greatest improvement in affinity was observed 
when the ester linkage between the aromatic rings was 
replaced with an amide group, resulting in nanomolar 
inhibition constants. The structures and Ki values of the five 
representative inhibitors are summarized in Table 3. The 
inhibitors display a mixed competitive pattern (binding to 
both the free enzyme and the enzyme-substrate complex). 
This may be due to the large substrate used in the kinetic 
measurements. These compounds inhibited the enzyme in a 
reversible manner. 
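
For orientation, the mixed inhibition pattern mentioned above corresponds to the textbook steady-state rate law sketched below. This is a generic form, not the fitting model reported in [35]; the symbols K_i (inhibitor dissociation constant for the free enzyme) and K_i' (for the enzyme-substrate complex) are notation assumed here for illustration.

```latex
v = \frac{V_{\max}\,[S]}
         {K_m\left(1 + \dfrac{[I]}{K_i}\right) + [S]\left(1 + \dfrac{[I]}{K_i'}\right)}
```

In the limit K_i' → ∞ the second correction term vanishes and the expression reduces to purely competitive inhibition, in which only the apparent K_m is raised by the factor (1 + [I]/K_i).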


6. Metal-Conjugated Inhibitors 


As shown above in high throughput screening of 960 
compounds, two of the compounds identified as potent SARS 
protease inhibitors, phenylmercuric acetate and thimerosal, 
contain mercury. This suggests using a metal ion as a chelator 
in Cys protease inhibitors, since metal ions such as Hg2+, 
Zn2+, and Cu2+ have high affinities for the sulfur atom of the 
Cys residue [64,65]. Since mercury-containing compounds 
may pose therapeutic hazards to a patient if taken orally, a 
series of zinc-containing compounds and metal ions were 
evaluated for SARS protease inhibition (Table 4). The most 
potent inhibitor found was 1-hydroxypyridine-2-thione zinc, 
a competitive inhibitor with a Ki value of 0.17 µM, about 
sixfold lower than that of Zn2+ alone (1.1 µM). The 
involvement of the organic moiety of 1-hydroxypyridine-2- 
thione zinc in inhibitory activity was supported by the 
observations that N-ethyl-N-phenyldithiocarbamic acid zinc 
and toluene-3,4-dithiolato zinc showed Ki values (1.0 and 
1.4 µM, respectively) similar to that of Zn2+. 
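
The comparison of Ki values in the preceding paragraph can be made explicit with a short calculation. The sketch below is illustrative only: the Ki values are those quoted in the text, while the function names and the choice of Zn2+ as the reference are assumptions introduced here. The rate expression is the standard competitive-inhibition form and is not taken from the cited study.

```python
# Ki values (in µM) as quoted in the text for SARS 3CLpro inhibition.
KI_UM = {
    "1-hydroxypyridine-2-thione zinc": 0.17,
    "N-ethyl-N-phenyldithiocarbamic acid zinc": 1.0,
    "toluene-3,4-dithiolato zinc": 1.4,
    "Zn2+": 1.1,
}


def fold_vs_reference(ki_um, reference="Zn2+"):
    """Return Ki(reference) / Ki(compound) for every entry.

    Values above 1 mean tighter binding than the reference ion.
    (Hypothetical helper, named here for illustration.)
    """
    ref = ki_um[reference]
    return {name: ref / ki for name, ki in ki_um.items()}


def competitive_rate_fraction(s_over_km, inhibitor_um, ki_um):
    """v/Vmax for a competitive inhibitor.

    Uses v/Vmax = [S] / (Km * (1 + [I]/Ki) + [S]) with [S] expressed
    in units of Km, so Km itself does not need to be known.
    """
    return s_over_km / ((1.0 + inhibitor_um / ki_um) + s_over_km)


if __name__ == "__main__":
    for name, fold in fold_vs_reference(KI_UM).items():
        print(f"{name}: {fold:.1f}-fold relative to Zn2+")
```

On these numbers only the 1-hydroxypyridine-2-thione complex is clearly more potent than free Zn2+, which is the basis for attributing part of its activity to the organic ligand.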


Zn2+ was previously found to be tetrahedrally 
coordinated by three Cys sulfurs and one His nitrogen of the 
2A proteinase from a common cold virus, which is 
responsible for the shut-off of host-cell protein synthesis 
[66]. Zinc-containing compounds such as zinc acetate are 
added as a supplement to the drug for the treatment of 
Wilson's disease [67], indicating the safety of the ion for 
human use. Zinc salts such as zincum gluconicum 
(Zenullose) may be effective in treating the common cold, a 
disease caused by rhinoviruses, although the molecular 
target is not known [68]. Moreover, zinc ions inhibit 
replication of rhinoviruses [69,70]. Thus, the potential use of 
zinc-conjugated compounds as a therapeutic treatment for 
SARS remains to be explored. 
Table 3. Bifunctional Aryl Boronic Compounds as SARS 3CLpro Inhibitors [35] 

Compound    Structure 
FL-078      [not shown] 
FL-101      [not shown] 
FL-106      [not shown] 


Table 4. Metal-Conjugated Compounds and Metal Ions as Inhibitors of SARS 3CLpro [48] 

Compound                                    Structure 
Phenylmercuric nitrate                      [not shown] 
1-Hydroxypyridine-2-thione zinc             [not shown] 
N-Ethyl-N-phenyldithiocarbamic acid zinc    [not shown] 
Toluene-3,4-dithiolato zinc                 [not shown] 
Mercuric ion 
Zinc ion 


CONCLUSION 


In addition to the inhibitors identified from high 
throughput screening and rational design approaches, many 
other compounds have been suggested as potential inhibitors 
of SARS protease on the basis of computer modeling [e.g., 
71-78]. These compounds are not discussed here if they have 
not been tested in in vitro assays. Some degree of structural 
similarity between the active site of the SARS protease and 
that of other 3CL proteases exists, as some of the SARS 
protease inhibitors are derived from previously identified 
inhibitors of the rhinovirus 3C protease. The newly identified 
compounds for SARS protease may also serve as inhibitors 
of other viral proteases. The wealth of information on SARS 
protease inhibitors could provide a therapeutic solution if 
SARS returns in the future, and could possibly supply new 
drug leads for other viral diseases. Therefore, further research 
in this area is clearly desirable. 
ABBREVIATIONS 

SARS   = Severe acute respiratory syndrome 
CoV    = Coronavirus 
TGEV   = Transmissible gastroenteritis virus 
HCoV   = Human coronavirus 
MHV    = Mouse hepatitis virus 
BCoV   = Bovine coronavirus 
PEDV   = Porcine epidemic diarrhea virus 
3CLpro = 3C-like protease 
Dabcyl = 4-(4-dimethylaminophenylazo)benzoic acid 
Edans  = 5-[(2-aminoethyl)amino]naphthalene-1-sulfonic acid 
FRET   = Fluorescence resonance energy transfer 
AUC    = Analytical ultracentrifugation 
CMK    = Chloromethylketone 